GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification
Abstract
Graph neural networks (GNNs) have achieved great success in node classification tasks. However, existing GNNs naturally bias towards the majority classes with more labelled data and ignore the minority classes with relatively few labelled nodes. Traditional techniques often resort to over-sampling methods, but these may cause overfitting. More recently, some works propose to synthesize additional nodes for minority classes from the labelled nodes; however, there is no guarantee that the generated nodes really stand for the corresponding minority classes. In fact, improperly synthesized nodes may result in insufficient generalization of the algorithm. To resolve the problem, in this paper we seek to automatically augment the minority classes from the massive unlabelled nodes of the graph. Specifically, we propose GraphSR, a novel self-training strategy to augment the minority classes with significant diversity of unlabelled nodes, which is based on a Similarity-based selection module and a Reinforcement Learning (RL) selection module. The first module finds a subset of unlabelled nodes which are most similar to the labelled minority nodes, and the second one further determines the representative and reliable nodes from the subset via an RL technique. Furthermore, the RL-based module can adaptively determine the sampling scale according to the current training data. This strategy is general and can be easily combined with different GNN models. Our experiments demonstrate that the proposed approach outperforms the state-of-the-art baselines on various class-imbalanced datasets.
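The similarity-based selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or node representation GraphSR uses, so everything here is an assumption made for illustration: cosine similarity, a class centroid computed from labelled minority embeddings, and the hypothetical helper names `cosine`, `centroid`, and `select_candidates`. The RL-based selection module is not sketched.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_candidates(embeddings, labels, minority_class, k):
    """Return ids of the k unlabelled nodes most similar to the minority centroid.

    embeddings: dict mapping node id -> list of floats (assumed precomputed)
    labels:     dict mapping node id -> class label, labelled nodes only
    """
    labelled = [embeddings[n] for n, c in labels.items() if c == minority_class]
    if not labelled:
        return []  # no labelled minority nodes to compare against
    center = centroid(labelled)
    unlabelled = [n for n in embeddings if n not in labels]
    ranked = sorted(
        unlabelled,
        key=lambda n: cosine(embeddings[n], center),
        reverse=True,
    )
    return ranked[:k]


# Toy data: two labelled nodes and three unlabelled ones.
embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.8, 0.2],
    "e": [0.1, 0.9],
}
labels = {"a": "minority", "c": "majority"}
candidates = select_candidates(embeddings, labels, "minority", k=2)
print(candidates)  # ['b', 'd']
```

In a pipeline like the one the abstract describes, a candidate subset of this kind would then be passed to a second stage that decides which nodes are reliable enough to add to the training set.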
Similar Resources
On Mining Fuzzy Classification Rules for Imbalanced Data
Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias towards the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
Error back-propagation algorithm for classification of imbalanced data
Classification of imbalanced data is pervasive but it is a difficult problem to solve. In order to improve the classification of imbalanced data, this letter proposes a new error function for the error backpropagation algorithm of multilayer perceptrons. The error function intensifies weight-updating for the minority class and weakens weight-updating for the majority class. We verify the effect...
Intelligent Rule Mining Algorithm for Classification over Imbalanced Data
Association rule mining for classification is a data mining technique for finding informative patterns from large datasets. Output is in the form of if-then rules containing attribute value combinations in antecedent and class label in the consequent. This method is popular for classification as rules are simple to understand and allow users to look into the factors leading to a specific class ...
An Improved Algorithm for SVMs Classification of Imbalanced Data Sets
Support Vector Machines (SVMs) have strong theoretical foundations and excellent empirical success in many pattern recognition and data mining applications. However, when induced by imbalanced training sets, where the examples of the target class (minority) are outnumbered by the examples of the non-target class (majority), the performance of SVM classifier is not so successful. In medical diag...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i4.25622